Delphi’s COVIDcast Project:
An Ecosystem for Tracking and Forecasting the Pandemic

Ryan Tibshirani
Statistics and Machine Learning
Carnegie Mellon University




January 21, 2021

Delphi Then

Delphi Now

COVIDcast Indicators

COVIDcast Ecosystem

What Can This Be Used For?

This Talk

I can’t cover all of this! I’ll focus on our API, and some basic demos with our survey data (please ask about medical claims data, or ask about forecasting or nowcasting, during the Q & A)

Outline:

  1. COVIDcast API
  2. Symptom survey
  3. Forecasting demo

Reproducible talk: all code included

Part 1: COVIDcast API


COVIDcast API

The COVIDcast API is based on HTTP GET queries and returns data in JSON form. The base URL is https://api.covidcast.cmu.edu/epidata/api.php?source=covidcast

| Parameter | Description | Examples |
|---|---|---|
| data_source | data source | doctor-visits or fb-survey |
| signal | signal derived from data source | smoothed_cli or smoothed_adj_cli |
| time_type | temporal resolution of the signal | day or week |
| geo_type | spatial resolution of the signal | county, hrr, msa, or state |
| time_values | time units over which events happened | 20200406 or 20200406-20200410 |
| geo_value | location codes, depending on geo_type | * for all, or pa for Pennsylvania |
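
As a rough sketch of how these parameters combine into one request, the snippet below only assembles the URL string and sends nothing. The helper name `build_covidcast_url` is made up for illustration and is not part of the covidcast packages; the parameter names are taken from the table above, and whether the server expects exactly this encoding is an assumption.

```r
# Base URL as given in the talk
base_url = "https://api.covidcast.cmu.edu/epidata/api.php?source=covidcast"

# Hypothetical helper (not a Delphi function): build a query URL from the
# parameters in the table
build_covidcast_url = function(data_source, signal, time_type, geo_type,
                               time_values, geo_value) {
  params = c(data_source = data_source, signal = signal,
             time_type = time_type, geo_type = geo_type,
             time_values = time_values, geo_value = geo_value)
  # URL-encode each value, then join as key=value pairs
  encoded = vapply(params, URLencode, character(1), reserved = TRUE)
  query = paste0(names(params), "=", encoded, collapse = "&")
  paste0(base_url, "&", query)
}

# Facebook survey % CLI, daily, for Pennsylvania, April 6 to April 10
url = build_covidcast_url("fb-survey", "smoothed_cli", "day", "state",
                          "20200406-20200410", "pa")
print(url)
```

In practice the R and Python packages handle this step, so a hand-built URL is mostly useful for checking what a query would ask for.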

R and Python Packages

We also provide R and Python packages for API access. Highlights:

(Have an idea? File an issue or contribute a PR on our public GitHub repo)

Example: Deaths

How many COVID-19 deaths have been reported per day, in my state, since March 1?

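library(covidcast)
library(dplyr)         # %>%, filter(), group_by(), ...
library(tidyr)         # pivot_wider(), pivot_longer()
library(purrr)         # map_dfr()
library(ggplot2)       # plotting layers and themes
library(directlabels)  # geom_dl()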
start_day = "2020-03-01"
end_day = "2021-01-16"
deaths = covidcast_signal(data_source = "usa-facts", 
                          signal = "deaths_7dav_incidence_num", 
                          start_day = start_day, end_day = end_day,
                          geo_type = "state", geo_values = "pa")

plot(deaths, plot_type = "line", 
     title = "New COVID-19 deaths in PA (7-day average)") + 
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  theme(legend.position = "none")

Example: Hospitalizations

What percentage of daily hospital admissions are due to COVID-19 in PA, NY, TX?

hosp = covidcast_signal(data_source = "hospital-admissions", 
                        signal = "smoothed_adj_covid19_from_claims",
                        start_day = start_day, end_day = end_day,
                        geo_type = "state", geo_values = c("pa", "ny", "tx"))

plot(hosp, plot_type = "line", 
     title = "% of hospital admissions due to COVID-19") + 
  geom_dl(aes(y = value, color = geo_value, label = toupper(geo_value)), 
          method = "last.bumpup") +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  theme(legend.position = "none")

Example: New Cases

What does the current COVID-19 incident case rate look like, nationwide?

cases = covidcast_signal(data_source = "usa-facts", 
                         signal = "confirmed_7dav_incidence_prop",
                         start_day = end_day, end_day = end_day)

end_day_str = format(as.Date(end_day), "%B %d %Y")
plot(cases, title = "New COVID-19 cases per 100,000 people", range = c(0, 100), 
     choro_params = list(subtitle = end_day_str, legend_n = 6))

Example: Total Cases

What does the current COVID-19 cumulative case rate look like, nationwide?

cases = covidcast_signal(data_source = "usa-facts", 
                         signal = "confirmed_cumulative_prop",
                         start_day = end_day, end_day = end_day)

plot(cases, title = "Cumulative COVID-19 cases per 100,000 people", 
     range = c(0, 10000), 
     choro_params = list(subtitle = end_day_str, legend_n = 6))

Example: Doctor’s Visits

How do some cities compare in terms of doctor’s visits due to COVID-like illness?

dv = covidcast_signal(data_source = "doctor-visits", 
                      signal = "smoothed_adj_cli", 
                      start_day = start_day, end_day = end_day,
                      geo_type = "msa", 
                      geo_values = name_to_cbsa(c("Miami", "New York", 
                                                  "Pittsburgh", "San Antonio")))

plot(dv, plot_type = "line", 
     title = "% of doctor's visits due to COVID-like illness") + 
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  scale_color_hue(labels = cbsa_to_name(unique(dv$geo_value)))

Example: Symptoms

How do my county and my friend’s county compare in terms of COVID symptoms?

sympt = covidcast_signal(data_source = "fb-survey", 
                         signal = "smoothed_hh_cmnty_cli",
                         start_day = "2020-04-15", end_day = end_day,
                         geo_values = c(name_to_fips("Allegheny"),
                                        name_to_fips("Fulton", 
                                                     state = "GA")))

plot(sympt, plot_type = "line", 
     title = "% of people who know somebody with COVID symptoms") + 
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  scale_color_hue(labels = fips_to_name(unique(sympt$geo_value)))

Example: Mask Use

How do some states compare in terms of self-reported mask usage?

mask = covidcast_signal(data_source = "fb-survey", 
                        signal = "smoothed_wwearing_mask",
                        start_day = "2020-09-15", end_day = end_day,
                        geo_type = "state", 
                        geo_values = c("dc", "ma", "ny",
                                       "wy", "sd", "id"))

plot(mask, plot_type = "line", 
     title = "% of people who wear masks in public most/all the time") +
  geom_dl(aes(y = value, color = geo_value, label = toupper(geo_value)), 
          method = "last.bumpup") +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  theme(legend.position = "none")

As Of, Issues, Lag

By default the API returns the most recent data for each time_value. We also provide access to all previous versions of the data, using the following optional parameters:

| Parameter | To get data … | Examples |
|---|---|---|
| as_of | as if we queried the API on a particular date | 20200406 |
| issues | published at a particular date or date range | 20200406 or 20200406-20200410 |
| lag | published a certain number of time units after events occurred | 1 or 3 |
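
To make the three options concrete, here is a small base-R sketch that applies each one to a toy revision log. The values are made up, and the functions `query_as_of`, `query_issues`, and `query_lag` are hypothetical illustrations of the semantics described in the table, not the API's actual implementation.

```r
# Toy revision log (made-up values): each row is one published version of
# the value for a given time_value
versions = data.frame(
  time_value = as.Date(c("2020-04-06", "2020-04-06", "2020-04-06",
                         "2020-04-07", "2020-04-07")),
  issue      = as.Date(c("2020-04-07", "2020-04-08", "2020-04-10",
                         "2020-04-08", "2020-04-10")),
  value      = c(1.0, 1.4, 1.5, 2.0, 2.3))

# as_of: for each time_value, the latest version issued on or before a date
query_as_of = function(df, as_of) {
  df = df[df$issue <= as.Date(as_of), ]
  df = df[order(df$time_value, df$issue), ]
  df[!duplicated(df$time_value, fromLast = TRUE), ]
}

# issues: every version published within a date range
query_issues = function(df, from, to) {
  df[df$issue >= as.Date(from) & df$issue <= as.Date(to), ]
}

# lag: versions published exactly `lag` days after the events
query_lag = function(df, lag) {
  df[as.integer(df$issue - df$time_value) == lag, ]
}

as_of_result  = query_as_of(versions, "2020-04-08")
issues_result = query_issues(versions, "2020-04-08", "2020-04-10")
lag_result    = query_lag(versions, 1)
print(as_of_result)
```

With no option set, the default behavior described above corresponds to taking the latest issue for every time_value.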

Data Revisions

Why would we need this? Because many data sources are subject to revisions:

This presents a challenge to modelers: e.g., we have to learn how to forecast based on the data we’d have at the time, not updates that would arrive later

To accommodate, we log revisions even when the original data source does not!

Example: Backfill in Doctor’s Visits

The last two weeks of August in CA …

# Let's get the data that was available as of 09/21, for the end of August in CA
dv = covidcast_signal(data_source = "doctor-visits", 
                      signal = "smoothed_adj_cli",
                      start_day = "2020-08-15", end_day = "2020-08-31",
                      geo_type = "state", geo_values = "ca",
                      as_of = "2020-09-21")

# Plot the time series curve
xlim = c(as.Date("2020-08-15"), as.Date("2020-09-21"))
ylim = c(3.83, 5.92)
ggplot(dv, aes(x = time_value, y = value)) + 
  geom_line() +
  coord_cartesian(xlim = xlim, ylim = ylim) +
  geom_vline(aes(xintercept = as.Date("2020-09-21")), lty = 2) +
  labs(color = "as of", x = "Date", y = "% doctor's visits due to CLI in CA") +
  theme_bw() + theme(legend.position = "bottom")

Example: Backfill in Doctor’s Visits (Cont.)

The last two weeks of August in CA …

# Now loop over a bunch of "as of" dates, fetch data from the API for each one
as_ofs = seq(as.Date("2020-09-01"), as.Date("2020-09-21"), by = "week")
dv_as_of = map_dfr(as_ofs, function(as_of) {
  covidcast_signal(data_source = "doctor-visits", signal = "smoothed_adj_cli",
                   start_day = "2020-08-15", end_day = "2020-08-31", 
                   geo_type = "state", geo_values = "ca", as_of = as_of)
})

# Plot the time series curve "as of" September 1
dv_as_of %>% 
  filter(issue == as.Date("2020-09-01")) %>% 
  ggplot(aes(x = time_value, y = value)) + 
  geom_line(aes(color = factor(issue))) + 
  coord_cartesian(xlim = xlim, ylim = ylim) +
  geom_vline(aes(color = factor(issue), xintercept = issue), lty = 2) +
  labs(color = "as of", x = "Date", y = "% doctor's visits due to CLI in CA") +
  geom_line(data = dv, aes(x = time_value, y = value)) +
  geom_vline(aes(xintercept = as.Date("2020-09-21")), lty = 2) +
  theme_bw() + theme(legend.position = "none")

Example: Backfill in Doctor’s Visits (Cont.)

The last two weeks of August in CA …

dv_as_of %>% 
  ggplot(aes(x = time_value, y = value)) + 
  geom_line(aes(color = factor(issue))) + 
  coord_cartesian(xlim = xlim, ylim = ylim) +
  geom_vline(aes(color = factor(issue), xintercept = issue), lty = 2) +
  labs(color = "as of", x = "Date", y = "% doctor's visits due to CLI in CA") +
  geom_line(data = dv, aes(x = time_value, y = value)) +
  geom_vline(aes(xintercept = as.Date("2020-09-21")), lty = 2) +
  theme_bw() + theme(legend.position = "none")

Part 2: Symptom Surveys


Massive Symptom Survey

Through a recruitment partnership with Facebook, we survey about 50,000 people daily in the US (over 16 million since the survey began in April 2020). Topics include:

A parallel, international effort by the University of Maryland reaches 100+ countries in 55 languages

Massive Symptom Survey (Cont.)

This is the largest non-Census research survey ever conducted (that we know of). Raw response data is freely available to researchers who sign a data use agreement

COVID-Like Illness

Using the survey data we generate daily, county-level estimates of:

(Note that COVID-like illness, or CLI, is defined as a fever of at least 100 °F along with cough, shortness of breath, or difficulty breathing. We also ask people to report on rarer symptoms.)
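
The CLI definition can be written as a one-line rule. The sketch below applies it to four made-up responses; the column names are hypothetical (the real survey may code its items differently), and the plain unweighted percentage here leaves out whatever weighting and smoothing go into the published estimates.

```r
# Toy responses with hypothetical column names (not real survey data)
responses = data.frame(
  fever_f        = c(101.2, 99.1, 100.0, 102.5),
  cough          = c(TRUE,  TRUE,  FALSE, FALSE),
  short_breath   = c(FALSE, FALSE, FALSE, FALSE),
  diff_breathing = c(FALSE, FALSE, TRUE,  FALSE))

# CLI: fever of at least 100 F, along with cough, shortness of breath,
# or difficulty breathing
is_cli = function(fever_f, cough, short_breath, diff_breathing) {
  fever_f >= 100 & (cough | short_breath | diff_breathing)
}

cli = with(responses, is_cli(fever_f, cough, short_breath, diff_breathing))

# Unweighted percentage of toy respondents meeting the definition
pct_cli = 100 * mean(cli)
print(pct_cli)
```

Note that the fourth respondent has a high fever but none of the accompanying symptoms, so does not count as CLI under this rule.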

Why % CLI-in-Community?

Why ask a proxy question (have people report on others)? Here are Spearman correlations with COVID-19 case rates:

# Fetch Facebook % CLI signal, % CLI-in-community signal and confirmed case
# incidence proportions
start_day = "2020-04-15"
end_day = "2021-01-16"
sympt1 = covidcast_signal("fb-survey", "smoothed_cli", 
                          start_day, end_day)
sympt2 = covidcast_signal("fb-survey", "smoothed_hh_cmnty_cli", 
                          start_day, end_day)
cases = covidcast_signal("usa-facts", "confirmed_7dav_incidence_prop", 
                         start_day, end_day)

# Consider only counties with at least 500 cumulative cases so far
case_num = 500
geo_values = covidcast_signal("usa-facts", "confirmed_cumulative_num",
                              max(cases$time_value), 
                              max(cases$time_value)) %>%
  filter(value >= case_num) %>% pull(geo_value)
sympt1_act = sympt1 %>% filter(geo_value %in% geo_values)
sympt2_act = sympt2 %>% filter(geo_value %in% geo_values)
cases_act = cases %>% filter(geo_value %in% geo_values)

# Compute correlations, per time, over all counties
df_cor1 = covidcast_cor(sympt1_act, cases_act, by = "time_value", 
                        method = "spearman")
df_cor2 = covidcast_cor(sympt2_act, cases_act, by = "time_value", 
                        method = "spearman")

# Stack rowwise into one data frame
df_cor = rbind(df_cor1, df_cor2)
df_cor$signal = c(rep("% CLI", nrow(df_cor1)), 
                  rep("% CLI-in-community", nrow(df_cor2)))

# Then plot correlations over time 
ggplot(df_cor, aes(x = time_value, y = value)) + 
  geom_line(aes(color = signal)) +
  labs(title = "Correlation between survey signals and case rates (by time)",
       subtitle = sprintf("Over all counties with at least %i cumulative cases",
                          case_num), x = "Date", y = "Correlation") +
  theme_bw() + theme(legend.position = "bottom", legend.title = element_blank())
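
To spell out what "correlations, per time, over all counties" means: for each date, take the signal and the case rate across locations and compute a single rank correlation. Here is a toy base-R version with made-up numbers and four fictional locations; the talk itself uses covidcast_cor() for this, so treat the code below only as an illustration of the idea.

```r
# Toy data (made up): two dates, four fictional locations
toy = data.frame(
  time_value = rep(as.Date(c("2020-04-15", "2020-04-16")), each = 4),
  geo_value  = rep(c("a", "b", "c", "d"), times = 2),
  signal     = c(1, 2, 3, 4, 1, 2, 3, 4),
  cases      = c(10, 40, 30, 80, 80, 30, 40, 10))

# One Spearman correlation per date, computed across locations
cor_by_time = sapply(split(toy, toy$time_value), function(df) {
  cor(df$signal, df$cases, method = "spearman")
})
print(cor_by_time)
```

On the first toy date the two rankings mostly agree, and on the second they mostly disagree, so the correlation flips sign. Because Spearman correlation depends only on ranks, it is insensitive to the scale of either signal.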

Beyond Symptom Data

Reminder: survey data extends far beyond symptoms. For example:

Example: Vaccine Acceptance

If a COVID-19 vaccine were offered to you today, would you definitely or probably get vaccinated?

start_day = "2021-01-18"
end_day = "2021-01-18"
vaccine = covidcast_signal(data_source = "fb-survey",
                           signal = "smoothed_waccept_covid_vaccine",
                           start_day, end_day, geo_type = "state")

plot(vaccine, title = "% of people who would accept COVID-19 vaccine",
     range = c(50, 85), choro_col = c("#D9F0C2", "#BFE6B5", "#1F589F"),
     choro_params = list(subtitle = format(as.Date(end_day), "%B %d %Y")))

Part 3: Forecasting Demo


An Early Indicator?

As motivation, let’s take a look at case counts in Miami-Dade, right around the second wave:

# Fetch Facebook % CLI-in-community signal and confirmed case incidence numbers
# from June 1 to July 15
start_day = "2020-06-01"
end_day = "2020-07-15"
sympt = covidcast_signal("fb-survey", "smoothed_hh_cmnty_cli", 
                         start_day, end_day)
cases = covidcast_signal("usa-facts", "confirmed_7dav_incidence_num",
                         start_day, end_day)

# Function to transform from one range to another
trans = function(x, from_range, to_range) {
  (x - from_range[1]) / (from_range[2] - from_range[1]) *
    (to_range[2] - to_range[1]) + to_range[1]
}

# Function to produce a plot comparing the signals for one county
ggplot_colors = c("#FC4E07", "#00AFBB", "#E7B800")
plot_one = function(geo_value, df1, df2, lab1, lab2, title = NULL, 
                    xlab = NULL, ylab1 = NULL, ylab2 = NULL) {
  # Filter down the signal data frames
  given_geo_value = geo_value
  df1 = df1 %>% filter(geo_value == given_geo_value)
  df2 = df2 %>% filter(geo_value == given_geo_value)
  
  # Compute ranges of the two signals
  range1 = df2 %>% select("value") %>% range(na.rm = TRUE)
  range2 = df1 %>% select("value") %>% range(na.rm = TRUE)
  
  # Convenience functions for our two signal ranges
  trans12 = function(x) trans(x, range1, range2)
  trans21 = function(x) trans(x, range2, range1)
  
  # Find state name, find abbreviation, then set title
  state_name = fips_to_name(paste0(substr(geo_value, 1, 2), "000"))
  state_abbr = name_to_abbr(state_name)
  title = paste0(fips_to_name(geo_value), ", ", state_abbr)
  
  # Transform the combined signal to the incidence range, then stack
  # these rowwise into one data frame
  df = select(rbind(df1 %>% mutate_at("value", trans21),
                    df2), c("time_value", "value"))
  df$signal = c(rep(lab1, nrow(df1)), rep(lab2, nrow(df2)))
  
  # Finally, plot both signals
  return(ggplot(df, aes(x = time_value, y = value)) +
           geom_line(aes(color = signal)) +
           scale_color_manual(values = ggplot_colors[1:2]) +
           scale_y_continuous(name = ylab1, limits = range1,
                              sec.axis = sec_axis(trans = trans12,
                                                  name = ylab2)) +
           labs(title = title, x = xlab) + theme_bw() +
           theme(legend.position = "bottom", legend.title = element_blank()))
}

# Produce a plot for Miami-Dade, and add vertical lines
plot_one(name_to_fips("Miami-Dade"), df1 = sympt, df2 = cases, 
         lab1 = "% CLI-in-community", lab2 = "New COVID-19 cases", 
         xlab = "Date", ylab1 = "New COVID-19 cases",
         ylab2 = "% of people who know someone with CLI") +
  geom_vline(xintercept = as.numeric(as.Date("2020-06-19")),
             linetype = 2, size = 1, color = ggplot_colors[1]) +
  geom_vline(xintercept = as.numeric(as.Date("2020-06-25")),
             linetype = 2, size = 1, color = ggplot_colors[2])
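
The two-axis plot relies on a linear rescaling between the range of one signal and the range of the other. A quick check on toy ranges (made-up numbers, not real data) shows how it behaves; the function is repeated here only so the snippet runs on its own.

```r
# Same linear map as trans() above
trans = function(x, from_range, to_range) {
  (x - from_range[1]) / (from_range[2] - from_range[1]) *
    (to_range[2] - to_range[1]) + to_range[1]
}

# Toy ranges: a percentage signal in [0, 10], case counts in [0, 2000]
pct_range  = c(0, 10)
case_range = c(0, 2000)

# Map percentages onto the case-count scale, then back again
mapped = trans(c(0, 2.5, 10), pct_range, case_range)
back   = trans(mapped, case_range, pct_range)
print(mapped)
print(back)
```

The endpoints of one range land on the endpoints of the other, and applying the map in the reverse direction recovers the original values, which is what lets the secondary axis show the survey signal in its own units.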

An Early Indicator? (Cont.)

Let’s look again, now at Allegheny County, right around the third wave:

start_day = "2020-10-15"
end_day = "2020-12-01"
sympt = covidcast_signal("fb-survey", "smoothed_hh_cmnty_cli", 
                         start_day, end_day)
cases = covidcast_signal("usa-facts", "confirmed_7dav_incidence_num",
                         start_day, end_day)

# Produce a plot for Allegheny County, and add vertical lines
plot_one(name_to_fips("Allegheny"), df1 = sympt, df2 = cases,
         lab1 = "% CLI-in-community", lab2 = "New COVID-19 cases", 
         xlab = "Date", ylab1 = "New COVID-19 cases",
         ylab2 = "% of people who know someone with CLI") +
  geom_vline(xintercept = as.numeric(as.Date("2020-10-30")),
             linetype = 2, size = 1, color = ggplot_colors[1]) +
  geom_vline(xintercept = as.numeric(as.Date("2020-11-06")),
             linetype = 2, size = 1, color = ggplot_colors[2])

Simple Forecasting Demo

Notational setup: for location \(\ell\) and time \(t\), let

To predict case rates \(d\) days ahead, consider two simple autoregressive models: \[ \begin{aligned} \mathrm{Quantile}_\tau(Y_{\ell,t+d} \,|\, \{Y_s,X_s:s\leq t\}) &= \alpha_\tau + \sum_{j=0}^2 \beta_{\tau,j} Y_{\ell,t-7j} \\ \mathrm{Quantile}_\tau(Y_{\ell,t+d} \,|\, \{Y_s,X_s:s\leq t\}) &= \alpha_\tau + \sum_{j=0}^2 \beta_{\tau,j} Y_{\ell,t-7j} + \sum_{j=0}^2 \gamma_{\tau,j} X_{\ell,t-7j} \end{aligned} \]
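
As a small sketch of what the first model needs as input, the code below builds the lagged features (lags of 0, 7, and 14 days) and the d-day-ahead target for one location, using a toy series. It also defines the pinball loss, which is the standard objective for fitting a quantile at level tau; that the talk's forecasters are fit in exactly this way is an assumption, and the variable names are made up for illustration.

```r
# Toy series for one location: y[t] = t, so the lags are easy to check by eye
y = 1:40
d = 7                  # number of days ahead
lags = c(0, 7, 14)     # corresponds to j = 0, 1, 2 in the model

# Times t at which every lag and the target at t + d are all available
t_ok = (max(lags) + 1):(length(y) - d)

# One row per usable time t: the target and the three lagged features
design = data.frame(
  target  = y[t_ok + d],
  y_lag0  = y[t_ok],
  y_lag7  = y[t_ok - 7],
  y_lag14 = y[t_ok - 14])

# Pinball (quantile) loss of predictions q at level tau
pinball = function(y, q, tau) {
  mean(pmax(tau * (y - q), (tau - 1) * (y - q)))
}

print(head(design, 3))
print(pinball(c(1, 2, 3), 2, 0.5))
```

For the second model, lagged columns of X would be appended to the same data frame, and each quantile level tau would get its own set of fitted coefficients.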

Simple Forecasting Demo (Cont.)


Results from state-level forecasts made from July 1 to August 15 (from the modeltools vignette):

# http://github.com/cmu-delphi/covidcast/blob/main/R-packages/modeltools/vignettes/quantgen-forecast.rda
load("quantgen-forecast.rda")

# Compute and plot scaled mean WIS as function of number of days ahead
evals %>%
  group_by(forecaster, ahead) %>%
  summarize(wis = mean(wis, na.rm = TRUE)) %>%
  pivot_wider(names_from = "forecaster", values_from = "wis") %>%
  mutate(QAR3 = QAR3 / Baseline, `QAR3 + CLI3` = `QAR3 + CLI3`/ Baseline) %>%
  select(-Baseline) %>%
  pivot_longer(cols = -ahead, names_to = "forecaster", values_to = "wis") %>%
  ggplot(aes(x = ahead, y = wis)) +  
  geom_line(aes(color = forecaster)) + 
  geom_point(aes(color = forecaster)) +
  scale_color_manual(values = ggplot_colors[2:1]) +
  labs(x = "Number of days ahead", y = "Scaled mean WIS") +
  theme_bw() + theme(legend.position = "bottom", legend.title = element_blank())
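
The talk does not spell out how WIS is computed, so here is a minimal sketch assuming the standard definition of the weighted interval score: a weighted sum of the absolute error of the median and the interval scores of a set of central prediction intervals. The numbers are made up, and the function names are hypothetical rather than taken from the evaluation code behind `evals`.

```r
# Interval score for a central (1 - alpha) prediction interval [lower, upper]
interval_score = function(y, lower, upper, alpha) {
  (upper - lower) +
    (2 / alpha) * pmax(lower - y, 0) +
    (2 / alpha) * pmax(y - upper, 0)
}

# WIS from a median plus K central intervals (vectors lower, upper, alpha)
wis_score = function(y, median, lower, upper, alpha) {
  K = length(alpha)
  total = 0.5 * abs(y - median) +
    sum((alpha / 2) * interval_score(y, lower, upper, alpha))
  total / (K + 0.5)
}

# Toy forecasts of an observed value of 12, each with one 80% interval
toy_wis      = wis_score(y = 12, median = 10, lower = 5, upper = 11, alpha = 0.2)
toy_baseline = wis_score(y = 12, median = 9,  lower = 2, upper = 16, alpha = 0.2)

# "Scaled" WIS: the ratio to the baseline, so values below 1 beat the baseline
scaled = toy_wis / toy_baseline
print(c(toy_wis, toy_baseline, scaled))
```

Lower WIS is better: the score rewards narrow intervals but penalizes intervals that miss the observed value.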

Wrapping Up

Delphi’s COVIDcast ecosystem has many parts:

  1. Unique relationships with partners in tech and healthcare granting us access to data on pandemic activity
  2. Code and infrastructure to build COVID-19 indicators, continuously-updated and geographically-comprehensive
  3. A historical database of all indicators, including revision tracking
  4. A public API (and R and Python packages) serving new indicators daily
  5. Interactive maps and graphics to display our indicators
  6. Nowcasting and forecasting work building on the indicators

In this pandemic, it’ll take an entire community to find answers to all the important questions. Please join ours!

Thanks

For more, visit https://covidcast.cmu.edu (you’ll find everything linked from there)


Delphi Carnegie Mellon University

Appendix


List of Currently Available Indicators

meta = covidcast_meta() %>%
  group_by(data_source, signal) %>%
  summarize(county = ifelse("county" %in% geo_type, "*", ""),
            msa = ifelse("msa" %in% geo_type, "*", ""),
            dma = ifelse("dma" %in% geo_type, "*", ""),
            hrr = ifelse("hrr" %in% geo_type, "*", ""),
            state = ifelse("state" %in% geo_type, "*", ""),
            nation = ifelse("nation" %in% geo_type, "*", "")) %>%
  mutate(signal = ifelse(nchar(signal) <= 25, signal,
                         paste0(substr(signal, 1, 22), "..."))) %>%
  as.data.frame() %>%
  print(right = FALSE, row.names = FALSE)
##  data_source           signal                    county msa dma hrr state nation
##  chng                  smoothed_adj_outpatien... *      *       *   *     *     
##  chng                  smoothed_adj_outpatien... *      *       *   *     *     
##  chng                  smoothed_outpatient_cli   *      *       *   *     *     
##  chng                  smoothed_outpatient_covid *      *       *   *     *     
##  doctor-visits         smoothed_adj_cli          *      *       *   *           
##  doctor-visits         smoothed_cli              *      *       *   *           
##  fb-survey             raw_cli                   *      *       *   *     *     
##  fb-survey             raw_hh_cmnty_cli          *      *       *   *     *     
##  fb-survey             raw_ili                   *      *       *   *     *     
##  fb-survey             raw_nohh_cmnty_cli        *      *       *   *     *     
##  fb-survey             raw_wcli                  *      *       *   *     *     
##  fb-survey             raw_whh_cmnty_cli         *      *       *   *     *     
##  fb-survey             raw_wili                  *      *       *   *     *     
##  fb-survey             raw_wnohh_cmnty_cli       *      *       *   *     *     
##  fb-survey             smoothed_accept_covid_... *      *       *   *     *     
##  fb-survey             smoothed_anxious_5d       *      *       *   *     *     
##  fb-survey             smoothed_cli              *      *       *   *     *     
##  fb-survey             smoothed_covid_vaccinated *      *       *   *     *     
##  fb-survey             smoothed_depressed_5d     *      *       *   *     *     
##  fb-survey             smoothed_felt_isolated_5d *      *       *   *     *     
##  fb-survey             smoothed_hh_cmnty_cli     *      *       *   *     *     
##  fb-survey             smoothed_ili              *      *       *   *     *     
##  fb-survey             smoothed_large_event_1d   *      *       *   *     *     
##  fb-survey             smoothed_nohh_cmnty_cli   *      *       *   *     *     
##  fb-survey             smoothed_others_masked    *      *       *   *     *     
##  fb-survey             smoothed_public_transi... *      *       *   *     *     
##  fb-survey             smoothed_restaurant_1d    *      *       *   *     *     
##  fb-survey             smoothed_shop_1d          *      *       *   *     *     
##  fb-survey             smoothed_spent_time_1d    *      *       *   *     *     
##  fb-survey             smoothed_tested_14d       *      *       *   *     *     
##  fb-survey             smoothed_tested_positi... *      *       *   *     *     
##  fb-survey             smoothed_travel_outsid... *      *       *   *     *     
##  fb-survey             smoothed_vaccine_likel...                          *     
##  fb-survey             smoothed_vaccine_likel...                          *     
##  fb-survey             smoothed_vaccine_likel...                          *     
##  fb-survey             smoothed_vaccine_likel...                          *     
##  fb-survey             smoothed_vaccine_likel...                          *     
##  fb-survey             smoothed_waccept_covid... *      *       *   *     *     
##  fb-survey             smoothed_wanted_test_14d  *      *       *   *     *     
##  fb-survey             smoothed_wanxious_5d      *      *       *   *     *     
##  fb-survey             smoothed_wcli             *      *       *   *     *     
##  fb-survey             smoothed_wcovid_vaccin... *      *       *   *     *     
##  fb-survey             smoothed_wdepressed_5d    *      *       *   *     *     
##  fb-survey             smoothed_wearing_mask     *      *       *   *     *     
##  fb-survey             smoothed_wfelt_isolate... *      *       *   *     *     
##  fb-survey             smoothed_whh_cmnty_cli    *      *       *   *     *     
##  fb-survey             smoothed_wili             *      *       *   *     *     
##  fb-survey             smoothed_wlarge_event_1d  *      *       *   *     *     
##  fb-survey             smoothed_wnohh_cmnty_cli  *      *       *   *     *     
##  fb-survey             smoothed_work_outside_... *      *       *   *     *     
##  fb-survey             smoothed_worried_becom... *      *       *   *     *     
##  fb-survey             smoothed_worried_finances *      *       *   *     *     
##  fb-survey             smoothed_wothers_masked   *      *       *   *     *     
##  fb-survey             smoothed_wpublic_trans... *      *       *   *     *     
##  fb-survey             smoothed_wrestaurant_1d   *      *       *   *     *     
##  fb-survey             smoothed_wshop_1d         *      *       *   *     *     
##  fb-survey             smoothed_wspent_time_1d   *      *       *   *     *     
##  fb-survey             smoothed_wtested_14d      *      *       *   *     *     
##  fb-survey             smoothed_wtested_posit... *      *       *   *     *     
##  fb-survey             smoothed_wtravel_outsi... *      *       *   *     *     
##  fb-survey             smoothed_wvaccine_like...                          *     
##  fb-survey             smoothed_wvaccine_like...                          *     
##  fb-survey             smoothed_wvaccine_like...                          *     
##  fb-survey             smoothed_wvaccine_like...                          *     
##  fb-survey             smoothed_wvaccine_like...                          *     
##  fb-survey             smoothed_wwanted_test_14d *      *       *   *     *     
##  fb-survey             smoothed_wwearing_mask    *      *       *   *     *     
##  fb-survey             smoothed_wwork_outside... *      *       *   *     *     
##  fb-survey             smoothed_wworried_beco... *      *       *   *     *     
##  fb-survey             smoothed_wworried_fina... *      *       *   *     *     
##  ght                   raw_search                       *   *   *   *           
##  ght                   smoothed_search                  *   *   *   *           
##  google-survey         raw_cli                   *      *       *   *           
##  google-survey         smoothed_cli              *      *       *   *           
##  google-symptoms       ageusia_raw_search        *      *       *   *     *     
##  google-symptoms       ageusia_smoothed_search   *      *       *   *     *     
##  google-symptoms       anosmia_raw_search        *      *       *   *     *     
##  google-symptoms       anosmia_smoothed_search   *      *       *   *     *     
##  google-symptoms       sum_anosmia_ageusia_ra... *      *       *   *     *     
##  google-symptoms       sum_anosmia_ageusia_sm... *      *       *   *     *     
##  hospital-admissions   smoothed_adj_covid19      *      *       *   *           
##  hospital-admissions   smoothed_adj_covid19_f... *      *       *   *     *     
##  hospital-admissions   smoothed_covid19          *      *       *   *           
##  hospital-admissions   smoothed_covid19_from_... *      *       *   *     *     
##  indicator-combination confirmed_7dav_cumulat... *      *       *   *     *     
##  indicator-combination confirmed_7dav_cumulat... *      *       *   *     *     
##  indicator-combination confirmed_7dav_inciden... *      *       *   *     *     
##  indicator-combination confirmed_7dav_inciden... *      *       *   *     *     
##  indicator-combination confirmed_cumulative_num  *      *       *   *     *     
##  indicator-combination confirmed_cumulative_prop *      *       *   *     *     
##  indicator-combination confirmed_incidence_num   *      *       *   *     *     
##  indicator-combination confirmed_incidence_prop  *      *       *   *     *     
##  indicator-combination deaths_7dav_cumulative... *      *       *   *     *     
##  indicator-combination deaths_7dav_cumulative... *      *       *   *     *     
##  indicator-combination deaths_7dav_incidence_num *      *       *   *     *     
##  indicator-combination deaths_7dav_incidence_... *      *       *   *     *     
##  indicator-combination deaths_cumulative_num     *      *       *   *     *     
##  indicator-combination deaths_cumulative_prop    *      *       *   *     *     
##  indicator-combination deaths_incidence_num      *      *       *   *     *     
##  indicator-combination deaths_incidence_prop     *      *       *   *     *     
##  indicator-combination nmf_day_doc_fbc_fbs_ght   *      *           *           
##  indicator-combination nmf_day_doc_fbs_ght       *      *           *           
##  jhu-csse              confirmed_7dav_cumulat... *      *       *   *     *     
##  jhu-csse              confirmed_7dav_cumulat... *      *       *   *     *     
##  jhu-csse              confirmed_7dav_inciden... *      *       *   *     *     
##  jhu-csse              confirmed_7dav_inciden... *      *       *   *     *     
##  jhu-csse              confirmed_cumulative_num  *      *       *   *     *     
##  jhu-csse              confirmed_cumulative_prop *      *       *   *     *     
##  jhu-csse              confirmed_incidence_num   *      *       *   *     *     
##  jhu-csse              confirmed_incidence_prop  *      *       *   *     *     
##  jhu-csse              deaths_7dav_cumulative... *      *       *   *     *     
##  jhu-csse              deaths_7dav_cumulative... *      *       *   *     *     
##  jhu-csse              deaths_7dav_incidence_num *      *       *   *     *     
##  jhu-csse              deaths_7dav_incidence_... *      *       *   *     *     
##  jhu-csse              deaths_cumulative_num     *      *       *   *     *     
##  jhu-csse              deaths_cumulative_prop    *      *       *   *     *     
##  jhu-csse              deaths_incidence_num      *      *       *   *     *     
##  jhu-csse              deaths_incidence_prop     *      *       *   *     *     
##  nchs-mortality        deaths_allcause_incide...                    *           
##  nchs-mortality        deaths_allcause_incide...                    *           
##  nchs-mortality        deaths_covid_and_pneum...                    *           
##  nchs-mortality        deaths_covid_and_pneum...                    *           
##  nchs-mortality        deaths_covid_incidence...                    *           
##  nchs-mortality        deaths_covid_incidence...                    *           
##  nchs-mortality        deaths_flu_incidence_num                     *           
##  nchs-mortality        deaths_flu_incidence_prop                    *           
##  nchs-mortality        deaths_percent_of_expe...                    *           
##  nchs-mortality        deaths_pneumonia_notfl...                    *           
##  nchs-mortality        deaths_pneumonia_notfl...                    *           
##  nchs-mortality        deaths_pneumonia_or_fl...                    *           
##  nchs-mortality        deaths_pneumonia_or_fl...                    *           
##  quidel                covid_ag_raw_pct_positive *      *       *   *           
##  quidel                covid_ag_smoothed_pct_... *      *       *   *           
##  quidel                raw_pct_negative                 *           *           
##  quidel                raw_tests_per_device             *           *           
##  quidel                smoothed_pct_negative            *           *           
##  quidel                smoothed_tests_per_device        *           *           
##  safegraph             bars_visit_num            *      *       *   *     *     
##  safegraph             bars_visit_prop           *      *       *   *     *     
##  safegraph             completely_home_prop      *      *       *   *     *     
##  safegraph             completely_home_prop_7dav *      *       *   *     *     
##  safegraph             full_time_work_prop       *      *       *   *     *     
##  safegraph             full_time_work_prop_7dav  *      *       *   *     *     
##  safegraph             median_home_dwell_time    *      *       *   *     *     
##  safegraph             median_home_dwell_time... *      *       *   *     *     
##  safegraph             part_time_work_prop       *      *       *   *     *     
##  safegraph             part_time_work_prop_7dav  *      *       *   *     *     
##  safegraph             restaurants_visit_num     *      *       *   *     *     
##  safegraph             restaurants_visit_prop    *      *       *   *     *     
##  usa-facts             confirmed_7dav_cumulat... *      *       *   *           
##  usa-facts             confirmed_7dav_cumulat... *      *       *   *           
##  usa-facts             confirmed_7dav_inciden... *      *       *   *           
##  usa-facts             confirmed_7dav_inciden... *      *       *   *           
##  usa-facts             confirmed_cumulative_num  *      *       *   *           
##  usa-facts             confirmed_cumulative_prop *      *       *   *           
##  usa-facts             confirmed_incidence_num   *      *       *   *           
##  usa-facts             confirmed_incidence_prop  *      *       *   *           
##  usa-facts             deaths_7dav_cumulative... *      *       *   *           
##  usa-facts             deaths_7dav_cumulative... *      *       *   *           
##  usa-facts             deaths_7dav_incidence_num *      *       *   *           
##  usa-facts             deaths_7dav_incidence_... *      *       *   *           
##  usa-facts             deaths_cumulative_num     *      *       *   *           
##  usa-facts             deaths_cumulative_prop    *      *       *   *           
##  usa-facts             deaths_incidence_num      *      *       *   *           
##  usa-facts             deaths_incidence_prop     *      *       *   *           
##  youtube-survey        raw_cli                                      *           
##  youtube-survey        raw_ili                                      *           
##  youtube-survey        smoothed_cli                                 *           
##  youtube-survey        smoothed_ili                                 *

Other Ways To Explore the Indicators

Other Ways To Explore the Survey

Medical Insurance Claims

Through partnership with Change Healthcare and others, we compute daily, county-level aggregate statistics from medical insurance claims covering over half of the US population

Despite challenges, these data have enormous potential for nowcasting. Why? They can help overcome major issues with public health reporting: limited testing capacity, reporting artifacts, retroactive re-definitions …


They can also help overcome what is by far the biggest problem: misleading or misinterpreted aggregation rules, e.g., cases are timestamped based on report date, not test date!